Considering a discrete and finite statistical model of a general position we introduce an exact
expression for the partition function in terms of a finite series. The leading term in the series is the 
Bethe-Peierls (Belief Propagation)-BP contribution, the rest are expressed as loop-contributions on 
the factor graph and calculated directly using the BP solution. The series unveils a small parameter 
that often makes the BP approximation so successful. Applications of the loop calculus in statistical 
physics and information science are discussed. 
Discrete statistical models, the Ising model being the
most famous example, play a prominent role in theoreti- 
cal and mathematical physics. They are typically defined 
on a lattice, and major efforts in the field focused pri- 
marily on the case of the infinite lattice size. Similar sta- 
tistical models emerge in information science. However, 
the most interesting questions there are related to graphs 
that are very different from a regular lattice. Moreover 
it is often important to consider large but finite graphs. 
Statistical models on graphs with long loops are of par- 
ticular interest in the fields of error-correction and combi- 
natorial optimization. These graphs are tree-like locally. 

A theoretical approach pioneered by Bethe [1] and Peierls [2]
(see also [3]), who suggested analyzing statistical models on
perfect trees, has largely remained a useful, efficiently
solvable toy. Indeed, these models on trees are effectively
one-dimensional and thus exactly solvable in the theoretical
sense, while the computational effort scales linearly with the
number of generations. The exact tree results have been extended
to higher-dimensional lattices as uncontrolled approximations.
In spite of the absence of analytical control, the Bethe-Peierls
approximation gives remarkably accurate results, often
outperforming standard mean-field results. The ad hoc approach
was also restated in a variational form [4, 5]. Except for two
recent papers [6, 7] that will be discussed later in the letter,
no systematic attempts to construct a regular theory with a
well-defined small parameter and Bethe-Peierls as its leading
approximation have been reported.

A similar tree-based approach in information science was
developed by Gallager [8] in the context of error-correction
theory. Gallager introduced so-called Low-Density Parity-Check
(LDPC) codes, defined on locally tree-like Tanner graphs. The
problem of ideal decoding, i.e. restoring the most probable
pre-image out of an exponentially large pool of candidates, is
identical to solving a statistical model on the graph [9]. The
approximate yet efficient Belief-Propagation decoding algorithm
introduced by Gallager constitutes an iterative solution of the
Bethe-Peierls equations, derived as if the statistical problem
were defined on a tree that locally represents the Tanner graph.
We use the coincidence of the abbreviations to refer to the
Bethe-Peierls and Belief-Propagation equations by the same
acronym, BP. The recent resurgence of interest in LDPC codes
[10], as well as the proliferation of the BP approach into other
areas of information and computer science, e.g. artificial
intelligence [11] and combinatorial optimization [12], where
interesting statistical models on graphs with long loops are
also involved, has posed the following questions. Why does BP
perform so well on graphs with loops? What is the hidden small
parameter that ensures the exceptional performance of BP? How
can we systematically correct BP? This letter provides
systematic answers to all these questions.

The letter is organized as follows. We start by introducing
notation for a generic statistical model, formulated in terms of
interacting Ising variables with the network described via a
factor graph. We next state our main result: a decomposition of
the partition function of the model into a finite series. The BP
expression for the model represents the first term in the
series. All other terms correspond to closed, undirected
subgraphs of the factor graph that may branch but never
terminate at a node; we refer to them as generalized loops. The
simplest diagram is a single loop. An individual contribution is
the product of local terms along a generalized loop, expressed
explicitly in terms of simple correlation functions calculated
within BP. We proceed by discussing the meaning of BP as a
successful approximation in terms of the loop series, and then
present a derivation of the loop series. The derivation includes
three steps. We first introduce a family of local gauge
transformations, two per original Ising variable. A gauge
transformation changes individual terms in the expansion, while
the full expression for the partition function naturally remains
unchanged. We then fix the gauge in such a way that only those
terms that correspond to generalized loops contribute to the
modified series. Finally, we show that the first term in the
resulting generalized loop series corresponds exactly to the
standard BP approximation. This interprets BP as a special gauge
choice. We conclude by clarifying the relation of this work to
other recent advances in the subject, and by discussing possible
applications and generalizations of the approach.

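Vertex Model. Consider a generic discrete statistical model
defined on an arbitrary finite undirected graph, $\Gamma$, with
bits $a, b = 1, \ldots, m$, where neighbors are connected by
edges, $(a,b), \ldots$, and the neighbor relation is expressed
as $a \in b$ or $b \in a$. Configurations $\boldsymbol{\sigma}$
are characterized by sets of binary (spin) variables
$\sigma_{ab} = \pm 1$, associated with the graph edges:
$\boldsymbol{\sigma} = \{\sigma_{ab};\ (a,b) \in \Gamma\}$. The
probability of configuration $\boldsymbol{\sigma}$ is

$$p(\boldsymbol{\sigma}) = Z^{-1} \prod_{a \in \Gamma} f_a(\boldsymbol{\sigma}_a), \qquad Z = \sum_{\boldsymbol{\sigma}} \prod_{a \in \Gamma} f_a(\boldsymbol{\sigma}_a), \tag{1}$$

$f_a(\boldsymbol{\sigma}_a)$ being a non-negative function of
$\boldsymbol{\sigma}_a$, a vector built of $\sigma_{ab}$ with
$b \in a$: $\boldsymbol{\sigma}_a = \{\sigma_{ab};\ b \in a\}$.
The notation assumes $\sigma_{ab} = \sigma_{ba}$. Our vertex
model generalizes the celebrated six- and eight-vertex models of
Baxter [3]. An example of a factor graph with $m = 8$ that
corresponds to
$p(\boldsymbol{\sigma}) = Z^{-1} \prod_{a=1}^{8} f_a(\boldsymbol{\sigma}_a)$,
where
$\boldsymbol{\sigma}_1 \equiv (\sigma_{12}, \sigma_{14}, \sigma_{18})$,
$\boldsymbol{\sigma}_2 \equiv (\sigma_{12}, \sigma_{23})$,
$\boldsymbol{\sigma}_3 \equiv (\sigma_{23}, \sigma_{34})$,
$\boldsymbol{\sigma}_4 \equiv (\sigma_{14}, \sigma_{34}, \sigma_{45})$,
$\boldsymbol{\sigma}_5 \equiv (\sigma_{45}, \sigma_{56}, \sigma_{58})$,
$\boldsymbol{\sigma}_6 \equiv (\sigma_{56}, \sigma_{67})$,
$\boldsymbol{\sigma}_7 \equiv (\sigma_{67}, \sigma_{78})$,
$\boldsymbol{\sigma}_8 \equiv (\sigma_{18}, \sigma_{58}, \sigma_{78})$,
is shown in Fig. 1.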
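
To make the definition in Eq. (1) concrete, the following minimal Python sketch evaluates the partition function of a small vertex model by brute-force enumeration of the edge spins. The triangle graph, the coupling `J`, and the helper names `partition_function` and `ising_like_factor` are illustrative assumptions, not part of the model above, and the enumeration is feasible only for a handful of edges.

```python
import itertools
import math

def partition_function(neighbors, factor):
    """Brute-force Z of Eq. (1): one spin sigma_ab = sigma_ba = +/-1 per edge.

    neighbors: dict mapping each bit a to the list of its neighbours.
    factor:    callable factor(a, local) -> non-negative weight f_a, where
               local maps every neighbour b of a to the value of sigma_ab.
    """
    edges = sorted({tuple(sorted((a, b))) for a in neighbors for b in neighbors[a]})
    Z = 0.0
    for values in itertools.product((-1, 1), repeat=len(edges)):
        sigma = dict(zip(edges, values))
        weight = 1.0
        for a in neighbors:
            local = {b: sigma[tuple(sorted((a, b)))] for b in neighbors[a]}
            weight *= factor(a, local)
        Z += weight
    return Z

# Hypothetical example: a triangle whose factors couple the two edge spins
# meeting at each bit with an assumed strength J.
J = 0.5
triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}

def ising_like_factor(a, local):
    s, t = local.values()
    return math.exp(J * s * t)

Z = partition_function(triangle, ising_like_factor)
```

For this particular choice the three edge spins form a closed Ising chain, so the result can be compared with the transfer-matrix value $(2\cosh J)^3 + (2\sinh J)^3$.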

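FIG. 1: Example of a factor graph. Twelve possible marked paths
(generalized loops) are shown in bold at the bottom.

Loop decomposition. The main exact result of the Letter is a
decomposition of the partition function defined by Eq. (1) into
a finite series:

$$Z = Z_0 \left(1 + \sum_{C} \frac{\prod_{a \in C} \mu_a(C)}{\prod_{(a,b) \in C} \left(1 - m_{ab}(C)^2\right)}\right), \tag{2}$$

$$m_{ab}(C) = \sum_{\sigma_{ab}} \sigma_{ab}\, b_{ab}(\sigma_{ab}), \tag{3}$$

$$\mu_a(C) = \sum_{\boldsymbol{\sigma}_a} \prod_{b \in a,\, C} (\sigma_{ab} - m_{ab})\, b_a(\boldsymbol{\sigma}_a), \tag{4}$$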

where the summation goes over all allowed (marked) paths $C$, or
generalized loops. They consist of bits each with at least two
distinct neighbors along the path. Twelve allowed marked paths
for our example are shown in Fig. 1. A generalized loop can be
disconnected, e.g. the last one in the second row shown in
Fig. 1. In Eqs. (2)-(4), $b_{ab}(\sigma_{ab})$,
$b_a(\boldsymbol{\sigma}_a)$ and $Z_0$ are the beliefs
(probabilities) defined on edges and bits, and the partition
function, respectively, calculated within BP. A BP solution can
be interpreted as an exact solution on an infinite tree built by
unwrapping the factor graph. A BP solution can also be
interpreted as a set of beliefs that minimize the Bethe free
energy

$$F = \sum_a \sum_{\boldsymbol{\sigma}_a} b_a(\boldsymbol{\sigma}_a) \ln\frac{b_a(\boldsymbol{\sigma}_a)}{f_a(\boldsymbol{\sigma}_a)} - \sum_{(a,b)} \sum_{\sigma_{ab}} b_{ab}(\sigma_{ab}) \ln b_{ab}(\sigma_{ab}),$$

under the set of realizability,
$0 \le b_a(\boldsymbol{\sigma}_a), b_{ab}(\sigma_{ab}) \le 1$,
normalization,
$\sum_{\boldsymbol{\sigma}_a} b_a(\boldsymbol{\sigma}_a) = \sum_{\sigma_{ab}} b_{ab}(\sigma_{ab}) = 1$,
and consistency,
$\sum_{\boldsymbol{\sigma}_a \setminus \sigma_{ab}} b_a(\boldsymbol{\sigma}_a) = b_{ab}(\sigma_{ab})$,
constraints. The term associated with a marked path is the ratio
of the product of the irreducible correlation functions (4) to
the product of the quadratic magnetization at-edge functions,
$1 - m_{ab}^2$ with $m_{ab}$ from (3), both calculated along the
marked path $C$ within the BP approximation.
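
The rule that defines a generalized loop (every bit on the path has at least two neighbors along it) is easy to state in code. The sketch below is a naive illustration, assuming a simple graph given as a list of edges; the function name `generalized_loops` and the two test graphs are hypothetical and are not the graph of Fig. 1. It scans all edge subsets, so its cost grows exponentially with the number of edges.

```python
import itertools

def generalized_loops(edges):
    """Return all non-empty edge subsets in which no node has degree one."""
    loops = []
    for k in range(1, len(edges) + 1):
        for subset in itertools.combinations(edges, k):
            degree = {}
            for a, b in subset:
                degree[a] = degree.get(a, 0) + 1
                degree[b] = degree.get(b, 0) + 1
            # Only nodes touched by the subset appear in `degree`;
            # each of them must have at least two incident edges.
            if all(d >= 2 for d in degree.values()):
                loops.append(subset)
    return loops

# Hypothetical test graphs.
triangle = [(0, 1), (0, 2), (1, 2)]
two_triangles = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3)]  # share edge (1, 2)

print(len(generalized_loops(triangle)))       # a single cycle
print(len(generalized_loops(two_triangles)))  # two triangles, outer cycle, whole graph
```

In the second graph the subset containing all five edges qualifies even though it is not a simple cycle, because its two branching nodes have degree three and no node is left with a loose end.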

As usual in statistical mechanics, exact expressions for the
spin correlation functions can be obtained by differentiating
Eq. (2) with respect to the proper factor functions. In the tree
(no loops) case only the unity term on the r.h.s. of Eq. (2)
survives. In the general case Eq. (2) provides a clear criterion
for the validity of the BP approximation: the sum over the loops
on the r.h.s. of Eq. (2) should be small compared to one. The
number of terms in the series increases exponentially with the
number of bits. Therefore, Eq. (2) becomes useful when a
smaller-than-exponential number of leading contributions can be
selected. In a large system the leading contribution comes from
the paths in which the number of degree-two nodes substantially
exceeds the number of branching nodes, i.e. the ones with higher
connectivity degree. According to Eq. (2) the contribution of a
long path is given by the ratio of the along-the-path product of
the irreducible nearest-neighbor spin correlation functions
associated with a bit, $\mu_a$, to the along-the-path product of
the edge contributions, $1 - m_{ab}^2$. All are calculated
within BP. Therefore, the small parameter in the perturbation
theory is
$\varepsilon = \prod_{a \in C} \mu_a(C) / \prod_{(a,b) \in C} (1 - m_{ab}^2)$.
If $\varepsilon$ is much smaller than one for all marked paths,
the BP approximation is valid. We anticipate the loop formula
(2) to be extremely useful for analysis and possible
differentiation between the loop contributions. Whether the
series is dominated by a single loop contribution or by a number
of comparable loop corrections will depend on the problem
specifics (the form of the factor graph and of the factor
functions). In the former case the leading correction to the BP
result is given by the marked path with the largest
$\varepsilon$.
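
As a worked illustration of the small parameter (written $\varepsilon$ here), consider the simplest loopy case: a factor graph that is a single cycle of $N$ bits, each of degree two. This special case and the homogeneous factor with an assumed coupling $J$ are chosen for illustration only; bit indices are understood modulo $N$.

```latex
% Single cycle: the only generalized loop C is the cycle itself,
% so the loop series has exactly one correction term.
Z = Z_0\,(1+\varepsilon), \qquad
\varepsilon = \frac{\prod_{a=1}^{N}\mu_a}{\prod_{a=1}^{N}\bigl(1-m_{a,a+1}^{2}\bigr)}, \qquad
\mu_a = \sum_{\boldsymbol{\sigma}_a}
  (\sigma_{a,a-1}-m_{a,a-1})(\sigma_{a,a+1}-m_{a,a+1})\,b_a(\boldsymbol{\sigma}_a).

% Homogeneous factors f_a = \exp(J\,\sigma_{a,a-1}\sigma_{a,a+1}) at the
% symmetric BP solution (all m_{ab} = 0, b_a = f_a / (4\cosh J)):
\mu_a = \tanh J, \qquad
\varepsilon = \tanh^{N}\! J, \qquad
Z_0 = (2\cosh J)^{N}, \qquad
Z = (2\cosh J)^{N} + (2\sinh J)^{N}.
```

The last expression coincides with the trace of the $N$-th power of the $2 \times 2$ transfer matrix, whose eigenvalues are $2\cosh J$ and $2\sinh J$, so in this example the BP term is the leading eigenvalue contribution and $\varepsilon$ decays exponentially with the loop length.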

Derivation of the loop formula. We relax the condition
$\sigma_{ab} = \sigma_{ba}$ in Eq. (1) and treat $\sigma_{ab}$
and $\sigma_{ba}$ as independent variables. This allows us to
represent the partition function in the form

$$Z = \sum_{\boldsymbol{\sigma}'} \prod_a f_a(\boldsymbol{\sigma}_a) \prod_{(b,c)} \frac{1 + \sigma_{bc}\sigma_{cb}}{2}, \tag{5}$$

where $\boldsymbol{\sigma}'$ has twice as many components as
$\boldsymbol{\sigma}$, since any pair of variables $\sigma_{ab}$
and $\sigma_{ba}$ enters $\boldsymbol{\sigma}'$ independently.
It is also assumed in Eq. (5) that each edge contributes to the
product over $(b,c)$ only once. The representation (5) is
advantageous over the original one since
$\boldsymbol{\sigma}_a$ at different bits become independent. We
further introduce a parameter vector $\boldsymbol{\eta}$ with
independent components $\eta_{ab}$ (i.e.,
$\eta_{ab} \neq \eta_{ba}$). Making use of the key identity



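$$\frac{\cosh(\eta_{bc}+\eta_{cb})\,(1+\sigma_{bc}\sigma_{cb})}{(\cosh\eta_{bc}+\sigma_{bc}\sinh\eta_{bc})(\cosh\eta_{cb}+\sigma_{cb}\sinh\eta_{cb})} = V_{bc},$$

$$V_{bc}(\sigma_{bc},\sigma_{cb}) = 1 + \bigl(\sinh(\eta_{bc}+\eta_{cb}) - \sigma_{bc}\cosh(\eta_{bc}+\eta_{cb})\bigr) \times \bigl(\sinh(\eta_{bc}+\eta_{cb}) - \sigma_{cb}\cosh(\eta_{bc}+\eta_{cb})\bigr), \tag{6}$$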

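we transform the product over edges on the r.h.s. of Eq. (5) to
arrive at:

$$Z = \Bigl(\prod_{(b,c)} 2\cosh(\eta_{bc}+\eta_{cb})\Bigr)^{-1} \sum_{\boldsymbol{\sigma}'} \prod_a P_a(\boldsymbol{\sigma}_a) \prod_{(b,c)} V_{bc}, \tag{7}$$

$$P_a(\boldsymbol{\sigma}_a) = f_a(\boldsymbol{\sigma}_a) \prod_{b \in a} (\cosh\eta_{ab} + \sigma_{ab}\sinh\eta_{ab}). \tag{8}$$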
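
The key identity (6) holds for any values of the two gauge parameters and for all four spin assignments, which can be confirmed numerically. The following Python sketch is a sanity check under the assumption that the spins take values $\pm 1$; the function names `lhs`, `vertex`, and `max_identity_error`, the sampling range, and the number of trials are arbitrary illustrative choices.

```python
import itertools
import math
import random

def lhs(eta_bc, eta_cb, s_bc, s_cb):
    """Left-hand side of Eq. (6) for spins s_bc, s_cb = +/-1."""
    numerator = math.cosh(eta_bc + eta_cb) * (1 + s_bc * s_cb)
    denominator = ((math.cosh(eta_bc) + s_bc * math.sinh(eta_bc))
                   * (math.cosh(eta_cb) + s_cb * math.sinh(eta_cb)))
    return numerator / denominator

def vertex(eta_bc, eta_cb, s_bc, s_cb):
    """V_bc of Eq. (6): unity plus the product of two half-edge terms."""
    e = eta_bc + eta_cb
    return 1 + ((math.sinh(e) - s_bc * math.cosh(e))
                * (math.sinh(e) - s_cb * math.cosh(e)))

def max_identity_error(trials=1000, seed=0):
    """Largest |lhs - V_bc| over random gauges and all spin assignments."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        eta_bc, eta_cb = rng.uniform(-1, 1), rng.uniform(-1, 1)
        for s_bc, s_cb in itertools.product((-1, 1), repeat=2):
            diff = abs(lhs(eta_bc, eta_cb, s_bc, s_cb)
                       - vertex(eta_bc, eta_cb, s_bc, s_cb))
            worst = max(worst, diff)
    return worst

print(max_identity_error())  # expected to be at the level of rounding error
```

Both sides vanish when the two spins on an edge disagree, which is how the relaxed representation keeps enforcing $\sigma_{bc} = \sigma_{cb}$ for any gauge.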

The desired decomposition Eq. (2) is obtained by choosing some
special values for the $\eta$-variables (fixing the gauge) and
expanding the $V$-terms in Eq. (7) in a series, followed by a
local computation (summations over the $\sigma$-variables at the
edges). Individual contributions to the series are naturally
identified with subgraphs of the original graph defined by a
simple rule: edge $(a,b)$ belongs to the subgraph if the
corresponding "vertex" $V_{ab}$ on the r.h.s. of Eq. (7)
contributes via its second (non-unity) term, defined according
to Eq. (6). We next utilize the freedom in the choice of
$\boldsymbol{\eta}$. The contributions that originate from
subgraphs with loose ends vanish provided the following system
of equations is satisfied:

$$\sum_{\boldsymbol{\sigma}_a} \bigl(\tanh(\eta_{ab}+\eta_{ba}) - \sigma_{ab}\bigr) P_a(\boldsymbol{\sigma}_a) = 0. \tag{9}$$



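The number of equations is exactly equal to the number of
$\eta$ variables. Moreover, Eqs. (9) are nothing but the BP
equations: simple algebraic manipulations (see [?] for details)
allow one to recast Eq. (9) in a more traditional BP form

$$\tanh\eta_{ba} = \frac{\sum_{\boldsymbol{\sigma}_a} \sigma_{ab}\, f_a(\boldsymbol{\sigma}_a) \prod_{c \in a}^{c \neq b} (\cosh\eta_{ac} + \sigma_{ac}\sinh\eta_{ac})}{\sum_{\boldsymbol{\sigma}_a} f_a(\boldsymbol{\sigma}_a) \prod_{c \in a}^{c \neq b} (\cosh\eta_{ac} + \sigma_{ac}\sinh\eta_{ac})}, \tag{10}$$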

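with the relation between the beliefs that minimize the Bethe
free energy $F$ and the $\eta$ fields according to:

$$b_a(\boldsymbol{\sigma}_a) = \frac{P_a(\boldsymbol{\sigma}_a)}{\sum_{\boldsymbol{\sigma}_a} P_a(\boldsymbol{\sigma}_a)}, \qquad b_{ab}(\sigma_{ab}) = \frac{\exp\bigl((\eta_{ab}+\eta_{ba})\sigma_{ab}\bigr)}{2\cosh(\eta_{ab}+\eta_{ba})}. \tag{11}$$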

The final expression Eq. (2) emerges as a result of a direct
expansion of the $V$-terms in Eq. (7), performing the summations
over the local $\sigma$-variables, making use of Eqs. (9) and
(11), and also identifying the BP expression for the partition
function as

$$Z_0 = \frac{\sum_{\boldsymbol{\sigma}'} \prod_a P_a(\boldsymbol{\sigma}_a)}{\prod_{(b,c)} 2\cosh(\eta_{bc}+\eta_{cb})}. \tag{12}$$

To summarize, Eq. (2) represents a finite series where all
individual contributions are related to the corresponding
generalized loops. This fine feature is achieved via a special
selection of the BP gauge (9). The condition enforces the "no
loose ends" rule, thus prohibiting anything but generalized loop
contributions to Eq. (2). Any individual contribution is
expressed explicitly in terms of the BP solution.
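
The whole construction can be checked numerically on the smallest loopy example. The Python sketch below is an illustration under stated assumptions, not the authors' code: it takes a triangle with made-up positive factors (the couplings `J` and fields `h` are arbitrary), iterates the traditional form of the BP equations for the $\eta$ fields starting from zero, builds the beliefs and the BP partition function from the resulting fixed point, and compares $Z_0 (1 + \varepsilon)$ with brute-force enumeration. For a triangle the only generalized loop is the triangle itself, so Eq. (2) has a single correction term. It relies on `math.prod`, available in Python 3.8 and later, and on the fact that $\cosh\eta + \sigma\sinh\eta = e^{\sigma\eta}$ for $\sigma = \pm 1$.

```python
import itertools
import math

# Hypothetical model: a triangle, one spin per edge, assumed positive factors.
neighbors = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
edges = [(0, 1), (0, 2), (1, 2)]
J = {0: 0.4, 1: -0.3, 2: 0.7}   # assumed couplings
h = {0: 0.2, 1: 0.1, 2: -0.3}   # assumed fields

def f(a, s):
    """Factor f_a; s maps each neighbour b of a to sigma_ab."""
    b, c = neighbors[a]
    return math.exp(J[a] * s[b] * s[c] + h[a] * (s[b] + s[c]))

def local_states(a):
    """All assignments of the spins on the edges adjacent to bit a."""
    for values in itertools.product((-1, 1), repeat=len(neighbors[a])):
        yield dict(zip(neighbors[a], values))

def exact_Z():
    """Brute-force partition function of Eq. (1)."""
    total = 0.0
    for values in itertools.product((-1, 1), repeat=len(edges)):
        sigma = dict(zip(edges, values))
        weight = 1.0
        for a in neighbors:
            weight *= f(a, {b: sigma[tuple(sorted((a, b)))] for b in neighbors[a]})
        total += weight
    return total

def gauge_weight(a, s, eta, skip=None):
    """Product over b in a (b != skip) of cosh(eta_ab) + sigma_ab*sinh(eta_ab)."""
    return math.prod(math.exp(s[b] * eta[(a, b)]) for b in neighbors[a] if b != skip)

def run_bp(n_iter=200):
    """Parallel iteration of the BP equations for the eta fields."""
    eta = {(a, b): 0.0 for a in neighbors for b in neighbors[a]}
    for _ in range(n_iter):
        new = {}
        for a in neighbors:
            for b in neighbors[a]:
                num = den = 0.0
                for s in local_states(a):
                    w = f(a, s) * gauge_weight(a, s, eta, skip=b)
                    num += s[b] * w
                    den += w
                new[(b, a)] = math.atanh(num / den)   # tanh(eta_ba) = num / den
        eta = new
    return eta

def loop_series(eta):
    """Return (Z_0, epsilon) for the single generalized loop of the triangle."""
    norm = {a: sum(f(a, s) * gauge_weight(a, s, eta) for s in local_states(a))
            for a in neighbors}
    m = {(a, b): math.tanh(eta[(a, b)] + eta[(b, a)]) for (a, b) in edges}
    Z0 = math.prod(norm.values()) / math.prod(
        2 * math.cosh(eta[(a, b)] + eta[(b, a)]) for (a, b) in edges)
    mu = {}
    for a in neighbors:
        mu[a] = sum(
            math.prod(s[b] - m[tuple(sorted((a, b)))] for b in neighbors[a])
            * f(a, s) * gauge_weight(a, s, eta)
            for s in local_states(a)) / norm[a]
    eps = math.prod(mu.values()) / math.prod(1 - m[e] ** 2 for e in edges)
    return Z0, eps

eta = run_bp()
Z0, eps = loop_series(eta)
print(Z0 * (1 + eps), exact_Z())
```

The two printed numbers should agree up to the convergence accuracy of the BP iteration, while $Z_0$ alone differs from the exact value by the relative amount $\varepsilon$.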

Comments, Conclusions and Path Forward. We expect that the BP
equations may have multiple solutions for a model with loops.
This expectation naturally follows from the notion of the
infinite covering graph, as different BP solutions correspond to
different ways to spontaneously break symmetry on the infinite
structure. These different BP solutions will generate loop
series (2) that are different term by term but give the same
result for the sum. Finding the "optimal" BP solution with the
smallest $\varepsilon$, characterizing the loop correction to
the BP solution, is important for applications. A solution
related to the absolute minimum of the Bethe free energy would
be a natural candidate. However, one cannot guarantee that the
absolute minimum, as opposed to other local minima of $F$, is
always "optimal" for arbitrary $f_a$.

We further briefly discuss other models related to the general
one discussed in the paper. The vertex model can be considered
on a graph of the special oriented/bipartite type. A bipartite
graph contains two families of nodes, referred to as bits and
checks, so that the neighbor relations occur only between nodes
from opposite families. A bipartite factor-graph model with the
additional property that any factor associated with a bit is
nonzero only if all Ising variables at the neighboring edges are
the same leads to the factor-graph model considered in [?].
Actually, this factorization condition means a re-assignment of
the Ising variables, defined at the edges of the original vertex
model, to the corresponding bits of the bipartite factor-graph
model. Furthermore, if only checks of degree two (each connected
to only two bits) are considered, the bipartite factor-graph
model reduces to the standard binary-interaction Ising model.
The loop series derived in this Letter is obviously valid for
all the aforementioned less general models. Also note that the
bipartite factor-graph model was chosen in [?] to introduce an
alternative derivation of the loop series via an integral
representation, where BP corresponds to the saddle-point
approximation for the resulting integral.

Let us now comment on two relevant papers [6, 7]. The Ising
model on a graph with loops has been considered by Montanari and
Rizzo [6], where a set of exact equations has been derived that
relates the correlation functions to each other. This system of
equations is under-defined; however, if irreducible correlations
are neglected, the BP result is restored. This feature has been
used in [6] to generate a perturbative expansion for corrections
to BP in terms of irreducible correlations. A complementary
approach for the Ising model on a lattice has been taken by
Parisi and Slanina [7], who utilized an integral representation
developed by Efetov [?]. The saddle point of the integral
representation used in [7] turns out to be exactly the BP
solution. Calculating perturbative corrections to the
magnetization, the authors of [7] encountered divergences in
their representation for the partition function; however, the
divergences cancelled out from the leading-order correction to
the magnetization, revealing a sensible loop correction to BP.
These papers, [6] and [7], became important initial steps
towards calculating and understanding loop corrections to BP.
However, both approaches are very far from being complete and
problem-free. Thus, [6] lacks an invariant representation in
terms of the partition function and requires operating with
correlation functions instead. Besides, the complexity of the
equations related to the higher-order corrections grows rapidly
with the order. The complementary approach of [7] contains
divergences (zero modes) that are dangerous, since they lack
analytical control, which constitutes a very problematic symptom
for any field theory. Both [6] and [7] focus on the Ising
pair-wise interaction model. The extensions of the proposed
methods to multi-bit interaction cases, the most interesting
ones from the information theory viewpoint, do not look
straightforward. Finally, the approaches of [6] and [7], if
extended to higher-order corrections, will result in infinite
series. Re-summing the corrections in all orders, so that the
result is presented in terms of a finite series, does not look
feasible within the proposed techniques.

We conclude with a discussion of possible applications and
generalizations. We see a major utility for Eq. (2) in its
direct application to models without short loops. In this case
Eq. (2) constitutes an efficient tool for improving BP by
accounting for the shortest loop corrections first and then
moving gradually (as long as the complexity remains feasible) to
longer and longer loops. Another application of Eq. (2) is the
direct use of $\varepsilon$ as a test parameter for the validity
of the BP approximation: if the shortest loop corrections to BP
are not small, one should either look for another solution of BP
(hoping that the loop corrections will be small within the
corresponding loop series) or conclude that no feasible BP
solution resulting in a small $\varepsilon$ can be used as a
valid approximation. There is also a strong generalization
potential here. If a problem is multi-scale, with both short and
long loops present in the factor graph, the development of a
synthetic approach combining the Generalized Belief Propagation
approach of [5] (which is efficient in accounting for local
correlations) with a corresponding version of Eq. (2) can be
beneficial. Finally, our approach can also be useful for the
analysis of standard (for statistical physics and field theory)
lattice problems. A particularly interesting direction will be
to use Eq. (2) for introducing a new form of resummation of
different scales. This can be applied to the analysis of lattice
models at the critical point, where correlations are long-range.